The presence of mixed text types in a document image is a major obstacle to automating optical character recognition. Machine printed character recognition and handwritten character recognition techniques
Abstract
In many documents, such as admission forms, bank cheques, memorandums, letters and application forms, machine-printed and handwritten characters are mixed. Since the algorithms for recognizing machine-printed text and handwritten text are different, the two types of text must be distinguished before they are passed to their respective OCR systems. This separation improves recognition performance and overall system quality. The paper discusses observations about the characteristics of these two types of text and groups the techniques for separating machine-printed from handwritten text into three categories (structural and statistical features, gradient features, and geometric features) according to the feature extraction method used.
Similar resources
Neural Network Based Recognition System Integrating Feature Extraction and Classification for English Handwritten
Handwriting recognition has been one of the active and challenging research areas in the field of image processing and pattern recognition. It has numerous applications, including reading aids for the blind, bank cheques, and conversion of handwritten documents into structured text. Neural Network (NN) with its inherent learning ability offers promising solutions for handwritten characte...
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
To facilitate data entry and digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
Distinction between Machine Printed Text and Handwritten Text in a Document
In many documents, machine-printed and handwritten texts are intermixed. Optical Character Recognition (OCR) techniques are different for machine-printed and handwritten text, so it is necessary to separate these texts before giving them as input to the OCR. In this paper we propose a methodology for the Hindi language. This methodology is based on structural features of text. Experimental results on a data...
Zone Based Features for Handwritten and Printed Mixed Kannada Digits Recognition
In the field of Optical Character Recognition (OCR), zoning is used to extract topological information from patterns. In this paper we propose zone-based features for recognition of a mixture of handwritten and printed Kannada digits. A digit image is divided into 64 zones and pixel density is computed for each zone. This procedure is repeated sequentially for every zone. Finally 64 features a...
Machine-printed and hand-written text lines identification
There are many types of documents in which machine-printed and handwritten texts appear intermixed. Since the optical character recognition (OCR) methodologies for machine-printed and handwritten texts are different, to achieve optimal performance it is necessary to separate these two types of texts before feeding them to their respective OCR systems. In this paper, we present a machine-printed a...
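The zoning scheme summarized in the Kannada digits entry above (a digit image divided into 64 zones, with pixel density computed for each zone) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the 8 x 8 grid layout, the function name `zone_density_features`, and the binary nested-list input format are all assumptions made for the example.

```python
def zone_density_features(image, grid=8):
    """Return grid*grid pixel-density features for a binary image.

    `image` is a list of rows; a truthy value marks an ink pixel.
    The 8 x 8 layout (64 zones) is an assumption; the source only
    states that the image is divided into 64 zones.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if height < grid or width < grid:
        raise ValueError("image must be at least grid x grid pixels")

    # Split rows and columns into `grid` nearly equal bands.
    row_edges = [i * height // grid for i in range(grid + 1)]
    col_edges = [j * width // grid for j in range(grid + 1)]

    features = []
    for i in range(grid):
        for j in range(grid):
            r0, r1 = row_edges[i], row_edges[i + 1]
            c0, c1 = col_edges[j], col_edges[j + 1]
            # Count ink pixels inside this zone.
            ink = sum(
                1
                for r in range(r0, r1)
                for c in range(c0, c1)
                if image[r][c]
            )
            # Density = fraction of the zone's pixels that are ink.
            features.append(ink / ((r1 - r0) * (c1 - c0)))
    return features


# Hypothetical 16 x 16 image with a 2 x 2 block of ink in the top-left corner.
sample = [[0] * 16 for _ in range(16)]
for r in range(2):
    for c in range(2):
        sample[r][c] = 1

features = zone_density_features(sample)
print(len(features))  # 64
```

The resulting 64-element vector would then be passed to whatever classifier is in use; the truncated abstract does not say which one, so none is assumed here.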